LONG TEST

Cost Impact Analysis of Team Data

#Global get comma sep csv and excel data read function

get_data <- function(filepath){
  
  
  #stop execution once the file is a non-character
  stopifnot("`filename` must be a character." = is.character(filepath)) 
  
  
  #Check if file is not an excel or csv
  if(!is.null(str_subset(filepath, pattern = ".xls{1}$|.xlsx{1}$|.csv{1}$", negate = TRUE)) && 
     length(str_subset(filepath, pattern = ".xls{1}$|.xlsx{1}$|.csv{1}$", negate = TRUE)) == 1 && 
     !is.na(str_subset(filepath, pattern = ".xls{1}$|.xlsx{1}$|.csv{1}$", negate = TRUE)) 
  ) {
    
    stop("not excel or csv file")
  }
  
  #Execute the function body
  if (isTRUE(stringr::str_extract(filepath, pattern = ".xls{1}$") == ".xls"))  {
    
    result <- readxl::read_xls(filepath) |> 
      janitor::clean_names() |> 
      dplyr::mutate_at(dplyr::vars(contains("date")),
                       ~as.Date(., format = "%m/%d/%Y"))
    
  } else if(isTRUE(stringr::str_extract(filepath, pattern = ".xlsx{1}$") == ".xlsx"))  {
    
    result <- readxl::read_xlsx(filepath) |> 
      janitor::clean_names() |> 
      dplyr::mutate_at(dplyr::vars(contains("date")),
                       ~as.Date(., format = "%m/%d/%Y"))
  } else {
    
    result <- readr::read_csv(filepath) |> 
      janitor::clean_names() |> 
      dplyr::mutate_at(dplyr::vars(contains("date")),
                       ~as.Date(., format = "%m/%d/%Y"))
  }
  
  
  return(result) 
  
}
# return on investment function
roi <- function(tot_savings,overall_costs){
  
  round(((tot_savings - overall_costs)/overall_costs) * 100,2)
    
}

1 Objectives

To Provide an overview of:

  1. Overall cost (only large items such as feld-m, fivetran, etc.)
  2. Cost/impact analysis of these tools
  3. Quantified impact of the three teams
  4. Quantified impact of an additional data scientist

2 Fivetran

Fivetran is an off-the-shelf tool used primarily for Data ingestion of more than 60% of our existing data sources. Fivetran has 5 pricing plans. We are currently subscribed to the Starter plan

We use fivetran to extract data from 20 source connectors of 7 types/categories as shown below.

connectors <-  tbl(con,
                 sql("

     select official_connector_name,type, availability
     from fivetran_log.connector_type

                     ")) %>% as_tibble() %>%
   #lower case + exclude
                 filter(!tolower(official_connector_name) %in% c("mailchimp", "freshdesk"))

  connectors %>%  datatable(
    extensions = 'Buttons',
    options = list(
      dom = 'Blfrtip',
      buttons = c('copy', 'csv', 'excel', 'pdf'),
      lengthMenu = list(c(10, 25, 50, -1),
                        c(10, 25, 50, "All"))
    )
  )

2.1 Cost impact analysis

For proper evaluation of cost and impact analysis of fivetran, i have formulated the questions below to guide the analysis:

  1. What are the pros and cons of fivetran in our current and future infra setup?

  2. What is the overall amount of budget (yearly contract value) and monthly spend per year?

  3. How does the monthly (actual) spend change over months for each year and what is the predicted cost for the upcoming months?

  4. What is our budget estimate for fivetran in Year 2023?

  5. What is the cost of replacing fivetran?

2.2 Pros of Fivetran

“What are the pros and cons of fivetran in our current and future infra setup?”

  • Self- Service Platform - Anyone in the team can setup and modify data sources. Less dependence on engineers and that’s why we didn’t have them from the start.

  • Ease of setup - New ingestion pipelines can be setup in less than 7mins.

  • Multiple data sources - There is a growing list of generally available and stable connectors.

  • Support & Maintenance - Maintenance is outsourced and support is based on SLA.

  • Stability - No major outage in the last 2.5 years of use and APIs are very mature and stable

  • No data loss - Even if data is lost or corrupted on our destination databases, we can always re-ingest/resync the original source data

  • Guaranteed Availability - We can even exceed planned consumption and still enjoy the services while you plan for a renewal.

2.3 Cons of Fivetran

  • Consumption-based pricing - This kind of pricing model can be tricky and moderately expensive.

  • Hard to estimate costs based on consumption - Increased sales will lead to increased monthly active rows. In 2021 we used up our annual budget 1 - 2 months month less than planned.

  • Not possible to define specific timeframes for data load - Not currently possible to reload/resync data based on specific custom time frames. You may have to reload the entire data from inception which can be a time-waster.

  • Slightly more expensive if you require more features beyond the basic (starter) plan - On a higher plan, it recomputes the data ingestion based on a higher pricing plan rate even if you only wanted one additional feature.

  • no refund /carry over of excess credits into new financial year - Fivetran implemented a new policy in late 2021 where customers will not be refunded for excess credits purchased and there will also be no carry over into a new financial year. This definitely means the estimation must be really precise to avoid financial loss.

What is the overall amount of budget (yearly contract value) and monthly spend per year?

The table below shows a breakdown of year-over-year spend on fivetran software.

 ft_year <- fivetran_estm %>% 
    #dplyr::select_all(tolower) %>% 
    mutate(year = year(date)) %>% 
    group_by(year) %>% 
    summarise(spend = sum(actual))%>% as_tibble()%>% 
                      rename("spend (€)" = "spend")

    kbl(ft_year) %>% 
    #kable_styling(bootstrap_options = "striped", full_width = F, position = "float_right")
    kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
year spend (€)
2020 6220.37
2021 21746.34
2022 47929.72

Note: Spend (€) shown for 2022 is as at Sept 2022. 

2.4 Actual Spend vs Predicted

“How does the monthly (actual) spend change over months for each year and what is the predicted cost for the upcoming years?”

Yearly Contract Value over the years has tripled, monthly spend has quadruplued largely due to growing volumes of new data sources, product portfolio and market expansions.

YCV 2022 still provides relative good spend coverage for upcoming growth plans. For 2023, we will anticipate at least 2.5X current YCV.

ggplot(fivetran_estm, aes(x=date)) + 
  geom_line(aes(y = budget), color = "gray", linetype="twodash", size = 1.45) + 
  geom_line(aes(y = actual), color="#FF5E0D", size = 1.7) +
  geom_line(data = fivetran_estm |> filter(date >= "2022-03-01"),
            aes(y = arima_model), color ="#370A5A", size = 1.45,linetype="twodash", alpha = 0.5) +
  labs(
    y = "Spend (€)",
    x = "",
    #title = " <b style= 'color:#545454;;'>Fivetran: Budget vs Expense</b>",
    title = " <b style= 'color:#545454;;'>Fivetran Starter Plan: Yearly Contract Value vs Spend</b>",
    subtitle = "<br><b style= 'color:#792DC5;'>YCV over the years has trippled, </b><b style= 'color:#FF5E0D;'>monthly spend has quadrupled largely due to growing vol of new data sources and market expansions.</b><br> 
    <b style= 'color:#6a6d6f;;'>YCV 2022 still provides relative spend coverage for upcoming growth plans, for 2023, we will anticipate at least 2X current YCV.</b><br><br>"
    
    # caption = paste("Data accessibility by Etoma Egot",
    #                 "Data source: {qualtrics}",
    #                 sep = "\n")
    # added a vertical line to indicate beginning of the dollar pricing model
  ) +
  geom_vline(data = fivetran_estm,
             mapping = aes(xintercept= as.Date("2022-02-01")),
             size = 1, colour = "black", linetype = "dashed", alpha = 0.5) +
  
  #adding pricing model
  annotate(
    geom = "text",
    x = as.Date("2021-10-01"),
    y = 11000, hjust = 0, vjust = 0.5,
    label = "credit-based\npricing model",
    family = "Roboto",
    size = 5.0
  )+
  
  annotate(
    geom = "text",
    x = as.Date("2022-03-01"),
    y = 11000, hjust = 0, vjust = 0.5,
    label = "dollar-based\npricing model",
    family = "Roboto",
    size = 5.0
    #colour = '#5F249F'
  )+
  
  #setting a limit and adding € to the y-axis values
  scale_y_continuous(
    breaks = seq(2000, 12000, 2000),
    limits = c(700, 12000),
    label = function(x) {
      return(paste(x, "€"))
    }
  )+
  
  # adding yearly contract value to the plot  
  geom_point(data = df_date,
             mapping = aes(x = mdate, y =tcv_label,
                           size =15, alpha = 0.5,
                           colour = year),
  ) +
  #scale the size of the circles
  scale_size_area(max_size =11, guide = FALSE) + 
  
  #manually specify the circle colours on the plot
  scale_color_manual(values = c("#792DC5","#792DC5","#792DC5","#5b5e5f"))+
  
  #labelling the total spend per year for the bubbles
  geom_text(
    aes(x = mdate, y= tcv_label_ext, colour = year,
        label = paste(paste("YCV",year),tcv, sep = "\n")),
    data = df_date,
    size = 3.5,
    vjust = 0.5,
    hjust = 0.25,
    nudge_y = -0.30,
    nudge_x = -0.50
  )+
  
  #annotating the Year 2023 planned spend
  annotate(
    geom = "text",
    x = as.Date("2020-12-31"),
    y = 10500, hjust = 0.3, vjust = 0.5,
    label = "For 2023, we are projecting\nmore than 2X the YCV 2022",
    family = "Roboto",
    size = 4.0,
    color = "#5b5e5f"
  ) +
  
  #annotating the arima model line
  annotate(
    geom = "text",
    x = as.Date("2022-10-01"),
    y = 6700, hjust = 0, vjust = 0.5,
    label = "Arima Time Series\nmodel predictions",
    family = "Roboto",
    size = 4.0,
    color = "#370A5A"
  ) +
  
  ####legends - Budget
  geom_text(
    data = fivetran_events |>
      filter(date == "2020-08-07" & events == "Budget"),
    mapping = aes(x = date, y = budget, label = events),
    size = 3.5, vjust = 0.5, hjust = 0, colour = "#979c9e") +
  
  ####legends - Spend
  geom_text(
    data = fivetran_events |>
      filter(date == "2020-08-07" & events == "Spend"),
    mapping = aes(x = date, y = budget, label = events),
    size = 3.5, vjust = 0.5, hjust = 0, colour = "#FF5E0D") +
  #   
  # ####all events that are live 
  geom_text(
    data = fivetran_events |>
      filter(!events %in% c("Steel Bottle Launch","IE Launch","DK/CZ Launch","Black Friday22","Budget","Spend")) ,
    mapping = aes(x = date, y = budget, label = events),
    size = 3.5, vjust = 0.5, hjust = 0, colour = "#cc4931") +
  
  #   ####all events excepts those that are live 
  geom_text(
    data = fivetran_events |>
      filter(events %in% c("Black Friday22")) ,
    mapping = aes(x = date, y = budget, label = events),
    size = 3.5, vjust = 0.5, hjust = 0, colour = "#979c9e") +
  
  scale_x_date(date_breaks = "4 month",
               date_labels = "%b %Y",
               limits = c(as.Date("2020-08-01"), NA)) +
  
  theme_WD() +
  theme (legend.position = "none",
         plot.title.position = "plot",
         plot.margin = margin(t = 15, r = 15, l=5),
         plot.title = ggtext::element_markdown(),
         plot.subtitle = element_markdown(lineheight = 1.2, size = 14),
         panel.grid.major.y = ggplot2::element_blank(),
         #plot.caption = ggtext::element_markdown(),
         axis.title.x = ggtext::element_markdown(size = 3),
         axis.title.y = ggtext::element_markdown(size = 12, hjust = 0.9),
         strip.text = element_text(size = 14, color = "#A8A7A7"),
         strip.background = element_rect(color = "#474747", fill = "#474747"),
         #axis.text = element_text(family = "Roboto", color = "#161616", size=15),
         axis.text = element_text(family = "Roboto", color = "#474747", size=11)
  )  

What is the budget estimate for fivetran in Year 2023?

With plans to launch six new markets and also grow our product portfolio. We are projecting 2.5X the yearly contract value of 2022. Ballpark figure of around 120K € under the current starter plan.

If we will use fivetran for more real-time data refresh e.g. 15mins refresh time, then we look 3X YCV of 2022. Shorter refresh times with fivetran will require an upgrade to fivetran’s standard plan. For more information, see pricing plans.

What is the cost of replacing fivetran with internal developments?

Realistically, we will be talking about hiring at least 10 data engineers who will have to replicate the source connectors with internally developed APIs. While these saves cost after a one-time development, it also creates a maintenance bottleneck and a constant need to be up-to-date to upgrade APIs whenever there are changes/upgrades from the source APIs.

Perhaps, a better alternative with fewer data engineers will be to introduce a parallel, competitive and less expensive data ingestion tool. The closest competitor is airbyte (open source with community contributors) but their connectors are not very stable yet. They are definitely challengers worth considering for the future.

3 FELD-M

feld-m is an external consultancy agency that provides services on different business areas. feld-m operates a time & material contract with an hourly rate of 137.5 €/hr. They provide(d) services to air up that span different areas such as:

  • Business Intellgence
  • Web Analytics
  • Data Engineering
  • Data Science
  • Project Manager & Stand ups

For proper evaluation of cost and impact analysis of feld-m, i have formulated the questions below to guide the analysis:

  1. What is the overall amount of budget and hours used and the remaining amounts for each?

  2. How does the actual cost change over each quarter of each year? and what is the predicted cost for the following 6 months?

  3. What are our future plans with FELD-M?

3.1 Overall Budget Spent

What is the overall amount of budget spent on the different service areas with FELD-M?

feld_m %>% 
  group_by(general_topic,year) %>% 
  summarise(across(contains(c("used_budget","duration")), list(total = sum),na.rm = TRUE, .names = "{col}.{fn}")) %>% 
  arrange(desc(year)) %>% 
  mutate_at(vars(contains("used_budget_percent.total")), ~round(used_budget_percent.total,2)*100) %>% 
  rename(share = "used_budget_percent.total","used_budget(€)" = "used_budget.total",
         "duration(hr)" = "duration.total") %>% 
  ungroup(general_topic) %>% 
  arrange(desc(share)) %>% 
  arrange(desc(year)) %>% 
  mutate_at(vars(contains("share")), ~paste(.,"%")) %>% 
  dplyr::relocate(-share)    %>%  datatable(
    extensions = 'Buttons',
    options = list(
      dom = 'Blfrtip',
      buttons = c('copy', 'csv', 'excel', 'pdf'),
      lengthMenu = list(c(10, 25, 50, -1),
                        c(10, 25, 50, "All"))
    )
  )

Note: used budget (€) shown for 2022 is as at Sept 2022. 

This is best illustrated visually using the slope chart below to see the trend by service area. Note: Click on the chart visuals so they pop out

For the last 3 quarters, (Q4 2021, Q1 2022, Q3 2022), Business Intelligence has been on the rise due to high demand especially triggered by capacity needs of the data democracy project which commenced in Q4 2021 and the increased meeting from sprint stand ups.

##---------------Prepping Slope Chart-------------

#Slope Graph
ggplot(data = feldm_slope, aes(x = year, y = used_budget.total, 
                               group = general_topic, color=general_topic)) +
  geom_line(size = 2) +
  geom_point(size = 5) +
  labs(
    y = "Spend (€)",
    x = "",
    title = " <b style= 'color:#545454;;'>FELD-M: YOY Used Budget by Functional Area</b>",
    subtitle = "<br><b style= 'color:#535256;'>Business Intelligence increase was due to data democracy project capacity demands (+2 internal analysts), data engineering increase was due to<br> increase in demand for custom API development, data science increase was due to capacity demands for CLV Sprints (+1 data scientist),</b><br><b style= 'color:#792DC5;'>The decrease observed for PM/StandUps was due to reduction in meeting frequency.</b><br>",
    
    caption = paste("feld-m budget usage by Etoma Egot",
                    "Data source: {Airup X FeldM Project Accounting Dashboard}",
                    sep = "\n")
    
  ) +
  scale_x_discrete(position = "top") +
  theme_WD()+
  ## Changing font size of x-axis and main title of the graph
  theme(
    #plot.title = element_text(color="black", size=15, face="bold",  hjust = 0.5 ), 
    axis.text.x = element_text(vjust = 0.5, hjust=0.5, size = 22, face="bold"),
    ## Removing grid lines
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank(),
    ##Add markdown elements for HTML
    plot.margin = margin(t = 15, r = 15, l=5),
    plot.title = ggtext::element_markdown(),
    plot.subtitle = element_markdown(lineheight = 1.2, size = 14),
    plot.caption = ggtext::element_markdown(),
    ## Removing the legend
    legend.position = "none",
    ## Removing everything from the y axis
    axis.title.y     = element_blank(),
    axis.text.y      = element_blank(),
    ## Removing title from the x axis 
    axis.title.x     = element_blank(),
    ## Remove x & y tick marks
    axis.ticks       = element_blank()) +
  
  ##Add labels to the slope chart
  geom_text(data = feldm_slope %>% filter(year == "2021"), 
            #aes(label = paste0(general_topic, " - ", round(used_budget_abb.total,0), "K")), 
            aes(label = paste0("               ",round(used_budget_abb.total,0), "K")), 
            hjust = 1.1, 
            #fontface = "bold", 
            vjust=0.25,
            size = 6.5 
  ) +
  geom_text(data = feldm_slope %>% filter(year == "2022"), 
            aes(label = paste0( round(used_budget_abb.total,0), "K", "  ",general_topic)) , 
            hjust = -0.1, 
            fontface = "bold",
            vjust = 0.25,
            size = 6.5
  ) +
  scale_color_manual(values = c("#B5B1B9","#B5B1B9","#B5B1B9","#792DC5","#B5B1B9"))
...

Figure 3.1: …

4 Team Data

4.1 Context

While there are no absolute methods or formulae for measuring ROI for data teams. We have formulated a measurement based on established best practices for measurement of ROI. Nonetheless, we applied it in the context of our specific scenario as data team of air up.

We currently do not sell our data products neither do we charge stakeholders directly for services rendered, doing this would have made it easier to quantify impact or directly measure our contributions to revenue. It’s often easier for the stakeholders/consumers to tell the ROI story.

Data teams operate in a very peculiar way. We act in support of other teams to impact business performance, increase profits and efficiency as well as enhance the quality of business decisions based on accurate information and insights from business intelligence deliverables.

A significant value the data team offers which is difficult to quantify is the ability to centralize, transform and aggregate data in a single source of truth and programmatically slice and dice or synthesize insights from different and mostly incompatible sources into one global data model which is then visualized on a central BI platform for users to self-serve e.g. tableau accessible to every licensed user.

Hence the premise of ROI computation for a data team is most feasible when benefits is measured based on direct savings made from eliminating redundant/recurring work, time saved from acessing different platforms (e.g 10 webshops) and ultimately generating value in near-real time.

4.2 Services Offered

We provide the following services:

  • Reports/Dashboards
  • Data Models
  • Adhoc Analysis (One-off)
  • Consultancy (Provide Clarifications, trainings, program abstract ideas into reality, intellectual powerhouse)
  • Advanced Modelling (Predictions/ forecast with AI, ML, Deep Learning, hypothesis validation etc.)
  • GA/GTM Tracking (Support Market Expansions, Geolocation tracking etc)
  • Data Integration (Build and maintain custom data source consumers)
  • Data Management (Governance, Maintenance, Monitoring etc.)

4.3 Prioritization

The journey of quantifying impact starts with ensuring a proper prioritization and task impact assessment is carried out. We usually demand the following information from our stakeholders: As a direct result of this exercise, we have also changed our request forms to mandate the users to provide answers to 2 of 3 questions.

Link to company or team goals

Which company goal does it support and please explain how it supports the company goal?

Business Impact

What type of business decision would you make with each chart?

Time savings

How many hours per day/week do you spend getting these information through other mediums? or working in excel?

4.4 Impact vs Operationalization Matrix

We have summarized our value quantification process using the below impact vs operationalization matrix. As a team, our aim is to prioritize tasks that aim for immediate impact with recurring long term value.

      knitr::include_graphics(here::here("images/prioritization_matrix.png"))
...

Figure 4.1: …

     #knitr::include_graphics('images/prioritization_matrix.png') -works but not on this machine, use the top one

4.5 Quantifying Impact of Teams

Let’s start with the definition of Return on Investment (ROI)

         ROI = (Net Income from investment / cost of investment) X 100

4.6 Assumptions

  • Used 220 working days in a year (excludes weekends and vacations)
  • Hourly rate differs by role seniority. It increases,the more senior the role..
  • 1hr per day of work saved per first level employee.
  • 1.5hr minimum saved per mid level to exec because time is more precious/expensive as you go higher up. - - Time saved was positively correlated with hourly rates.
  • personnel costs computed excludes bonus, training, team dinners, reimbursements etc.
  • software costs doesn’t include cloud infrastructure license for redshift infrastructure.
  • Based on 276 employees (currently 227 registered tableau users)
  • Actual costs was used for personnel, external and software (no assumptions here)
  • All costs were based on total costs expended as at sept 2022

4.7 Computing investment Cost

The below tables contain details of cost of investment in the following areas respectively:

  • Software costs
  • External costs
  • Personnel costs
 path <- here::here("datasets/roi_breakdown.xlsx")
 crazy_sheet <- lapply(excel_sheets(path), read_excel, path = path)
 
 crazy_sheet[[1]] %>% 
   mutate_at(vars(contains(c("yr_2022_09","yr_2022"))),~as.numeric(.)) %>% 
   kbl() %>% 
    kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
Software yr_2020 yr_2021 yr_2022_09 yr_2022_forecast
Fivetran 6220 21746 47930 58523.00
DBT 0 1432 3747 6500.00
Metaplane 0 720 1757 2377.61
Tableau 0 29090 82000 90000.00
GA360 0 0 8032 24097.00
Usercentrics 0 4070 8961 8961.00
Klipfolio 1704 852 0 0.00
Dbeaver 0 0 0 384.00
  crazy_sheet[[2]] %>% 
   kbl() %>% 
    kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
year nos_of_requests no_of_employees team_age Internal Costs external_costs software_costs overall_costs
2020 40 2 0.5 62500 0 9944 72444
2021 240 5 1 169000 223650 59931 452581
2022 185 11 1.7 506250 373924 152427 1032601

4.8 Computing direct benefits (time saved)

Assuming 10 Execs/VPs, 50 Mid-level and 216 First level teamies at different hourly rates (150 €/h, 55€/h and 42€ per hr). Example, 10 Exec/VPs working 220 working days with minimum of 1.5hr per day of work saved.

   Estimated Savings = 10 X 220 X 1.0 X 150  = €330000
   
       path <- here::here("datasets/roi_breakdown.xlsx")
 crazy_sheet <- lapply(excel_sheets(path), read_excel, path = path)

    
  crazy_sheet[[3]] %>% 
   kbl() %>% 
    kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
decision_makers Nr_of_employees example hourly_rate hours_saved_per_yr_excl_weekends savings_per_yr_all_ employees
Top Level 10 CEO, COS, CMO, VP 150 3300 330000
Mid Level 50 Head Ofs 55 16500 907500
First Level 216 Team Leads, SMEs, Teamies 42 47520 1995840

4.9 Compute ROI

Recall our ROI formula,

         ROI = (Net Income from investment / cost of investment) X 100
         
         OR
         
         ROI = (Total Estimated Savings - Investments)/Investments
         
  overall_costs <- crazy_sheet[[2]] %>% 
    filter(year == max(year)) |> pull(overall_costs)

tot_savings <- crazy_sheet[[3]] |> 
               janitor::clean_names() |> 
               summarise(total = sum(savings_per_yr_all_employees)) |> pull(total)
               

#Predicted Software Cost for 2022

software_total <- crazy_sheet[[1]] |> 
               janitor::clean_names() |> 
               mutate_at(vars(contains("yr_2022_forecast")), ~as.numeric(.)) |> 
               summarise(total = sum(yr_2022_forecast)) |> pull(total)

#Compute ROI

ROI <- roi(tot_savings,overall_costs)

Hence, ROI as at Sept 2022 is 213.13%.

Predicted software costs for 2022 is €190842.61

4.10 Quantifying Impact of an additional data scientist

The value quantification in current scope is at team level and therefore not applicable to individuals. Also, as agreed, we will not need to hire this role anymore in 2023.

4.11 Conclusions on ROI

We ended up with a positive ROI in 2nd Year only (assuming -100% ROI for the 1st 6 - 9 months) based on tangibles like time savings. If we factor in the intangible elements then realistically, the overall ROI may be in the region of 3-5X (X = 213.13%). Since these assets do not depreciate with increasing employees (as recently seen with the US launch, they re-used existing reports/dashboards) or increased usage then the ROI will not only stay positive but increase in value for Year 3 and beyond as long as our business model remains the same.

There are very many specific use cases of significant ROI generated for specific teams who spent 3-6 hrs daily on redundant work which has now been eliminated after integration of data source requirements and subsequently creating automated self-serve reports on tableau.

With anticipated reduction/elimination of external costs in 2023, the ROI will continue its linear growth.